vpage / vregion: a minimal version with a test#10636

Open
lyakh wants to merge 3 commits into thesofproject:main from lyakh:vreg

Conversation

Collaborator

@lyakh lyakh commented Mar 20, 2026

This is a slightly simplified version of #10281 with an added test

Contributor

Copilot AI left a comment


Pull request overview

Introduces a minimal “virtual pages” allocator and “virtual regions” (partitioned into interim + lifetime allocators) for SOF on Zephyr/ACE, along with a basic boot test and the required Kconfig/build/board configuration plumbing.

Changes:

  • Add vpage contiguous virtual-page allocator and vregion allocator built on top of it.
  • Wire new sources and configs into Zephyr build + Kconfig (including new virtual region attribute).
  • Add a Ztest-based boot test for vregion, and bump ACE board virtual region count.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 12 comments.

File Description
zephyr/test/vregion.c Adds a basic vregion allocator test under boot tests.
zephyr/test/CMakeLists.txt Builds the new test when CONFIG_SOF_VREGIONS is enabled.
zephyr/lib/vregion.c Implements vregion (interim k_heap + lifetime linear allocator).
zephyr/lib/vpages.c Implements vpage allocator and registers its init via SYS_INIT.
zephyr/lib/alloc.c Adjusts virtual heap bundle sizing/comments to match new region sizing.
zephyr/include/sof/lib/vregion.h Public API for vregion.
zephyr/include/sof/lib/vpages.h Public API for vpage.
zephyr/include/sof/lib/regions_mm.h Adds VIRTUAL_REGION_VPAGES_ATTR.
zephyr/Kconfig Adds SOF_VREGIONS, SOF_VPAGE_MAX_ALLOCS, SOF_ZEPHYR_VIRTUAL_PAGE_REGION_SIZE, and updates heap region sizing defaults/wording.
zephyr/CMakeLists.txt Builds vpages.c/vregion.c when CONFIG_SOF_VREGIONS is enabled.
app/boards/intel_adsp_ace40_nvls.conf Increases virtual region count to 3 for VPAGES region.
app/boards/intel_adsp_ace40_nvl.conf Increases virtual region count to 3 for VPAGES region.
app/boards/intel_adsp_ace30_wcl.conf Increases virtual region count to 3 for VPAGES region.
app/boards/intel_adsp_ace30_ptl.conf Increases virtual region count to 3 for VPAGES region.
app/boards/intel_adsp_ace20_lnl.conf Increases virtual region count to 3 for VPAGES region.
app/boards/intel_adsp_ace15_mtpm.conf Increases virtual region count to 3 for VPAGES region.


Comment on lines +155 to +158
LOG_INF("new at base %p size %#x pages %d struct embedded at %p",
(void *)vr->base, total_size, pages, (void *)vr);
LOG_DBG(" interim size %#x at %p", interim_size, (void *)vr->interim.heap.heap.init_mem);
LOG_DBG(" lifetime size %#x at %p", lifetime_size, (void *)vr->lifetime.base);

Copilot AI Mar 20, 2026


The LOG_INF/LOG_DBG calls here print total_size/interim_size/lifetime_size using %#x, but these are size_t. On 64-bit builds this can truncate and can fail with -Wformat. Prefer %zu or %#zx for sizes.
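As a minimal illustration of the suggested fix (plain snprintf stand-ins for the Zephyr LOG macros; the function name and buffer are hypothetical), `%zu`/`%#zx` keep size_t arguments intact on both 32- and 64-bit targets:

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* size_t arguments need the 'z' length modifier: %zu (decimal) or
 * %#zx (hex). Plain %d/%#x can truncate on LP64 targets and trip -Wformat. */
static int format_sizes(char *buf, size_t buflen, size_t total_size,
			size_t interim_size)
{
	return snprintf(buf, buflen, "total %#zx interim %zu",
			total_size, interim_size);
}
```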

Comment on lines +174 to +175
LOG_DBG("destroy %p size %#x pages %d", (void *)vr->base, vr->size, vr->pages);
LOG_DBG(" lifetime used %d free count %d", vr->lifetime.used, vr->lifetime.free_count);

Copilot AI Mar 20, 2026


vregion_destroy() logs vr->size and vr->lifetime.used with %#x/%d, but these are size_t. Please update these format strings to the correct size_t specifiers to avoid -Wformat issues.

Collaborator

@kv2019i kv2019i left a comment


One staging related comment inline. Otherwise looks good and addition of a small test case is great.

zephyr/Kconfig Outdated
hex "Size in bytes of virtual memory region for virtual heap shared for all cores"
depends on MM_DRV_INTEL_ADSP_MTL_TLB
default 0x100000
default 0xc0000
Collaborator


Any downsides to this change when done now? I mean this enables the feature in many builds by default, but the vregions are not actually used yet. I wonder if these Kconfig changes should be done in the PR where vregion is actually used in pipelines?

Collaborator Author


@kv2019i it's used by the test!

Collaborator


@lyakh And test is run at boot...? Hmm, right, so maybe ok then. With my recent user-space PR, I've kept the new tests standalone so I don't have to enable new features in Kconfig by default yet. OTOH, the major downside is these tests are not run by any CI then....

Collaborator

@softwarecki softwarecki left a comment


Most of the comments in this review were carried over from the previous review. Unfortunately, the code was moved to a new PR, which required re-reviewing the entire change. Again: vpage_ctx.velems is a large array (SOF_VPAGE_MAX_ALLOCS = 128), so it would be worth optimizing add and remove operations to avoid repeatedly scanning it linearly. Please also consider moving the page mapping logic into vregion; for lifetime allocators, mapping pages progressively as needed could help reduce memory usage.

vpage_ctx.num_elems_in_use = 0;

/* bundles are uint32_t of bits */
bitmap_num_bundles = SOF_DIV_ROUND_UP(block_count, 32);
Collaborator


32 => sizeof(uint32_t)? Below you use sizeof(uint32_t).
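For what it's worth, a self-contained sketch of the bundle-count math (macro names are illustrative, not SOF's): if the literal 32 is replaced, the divisor should be the number of bits per uint32_t, i.e. 8 * sizeof(uint32_t), not sizeof(uint32_t) alone:

```c
#include <stdint.h>
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) /* like SOF_DIV_ROUND_UP */
#define BITS_PER_BUNDLE (8 * sizeof(uint32_t))     /* 32 bits per bundle */

/* number of uint32_t bundles needed to hold one bit per block */
static size_t bitmap_num_bundles(size_t block_count)
{
	return DIV_ROUND_UP(block_count, BITS_PER_BUNDLE);
}
```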

* This allocator manages the allocation and deallocation of virtual memory pages from
* a predefined virtual memory region which is larger than the physical memory region.
*
* Both memory regions are divided into 4kB pages that are represented as blocks in a
Collaborator


should we really hard-specify the 4kB size?

Comment on lines +125 to +135
for (i = 0; i < VPAGE_MAX_ALLOCS; i++) {
if (vpage_ctx.velems[i].pages == 0) {
vpage_ctx.velems[i].pages = pages;
vpage_ctx.velems[i].vpage =
(POINTER_TO_UINT(vaddr) -
POINTER_TO_UINT(vpage_ctx.virtual_region->addr)) /
CONFIG_MM_DRV_PAGE_SIZE;
vpage_ctx.num_elems_in_use++;
break;
}
}
Collaborator


To eliminate the loop during page allocation, I suggest:

vpage_ctx.velems[vpage_ctx.num_elems_in_use].pages = pages;
vpage_ctx.velems[vpage_ctx.num_elems_in_use].vpage =
	(POINTER_TO_UINT(vaddr) -
	 POINTER_TO_UINT(vpage_ctx.virtual_region->addr)) /
	CONFIG_MM_DRV_PAGE_SIZE;
vpage_ctx.num_elems_in_use++;

Especially since this array is large by default: SOF_VPAGE_MAX_ALLOCS = 128

Collaborator Author

@lyakh lyakh Mar 26, 2026


...or use a list. This can be a later optimisation

Collaborator

@softwarecki softwarecki Mar 26, 2026


Lets assume SOF_VPAGE_MAX_ALLOCS = 8.

now imagine, that somebody allocates SOF_VPAGE_MAX_ALLOCS / 2 elements

velems[0] = alloc1
velems[1] = alloc2
velems[2] = alloc3
velems[3] = alloc4
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = free

then frees all but the last

Lets free alloc1:

velems[0] = alloc4 // Move last element to position of the element being released
velems[1] = alloc2
velems[2] = alloc3
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = free

then alloc2:

velems[0] = alloc4
velems[1] = alloc3 // Move last element to position of the element being released
velems[2] = free
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = free

then alloc3:

velems[0] = alloc4  // Move last element to position of the element being released
velems[1] = free
velems[2] = free
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = free

Then allocates another SOF_VPAGE_MAX_ALLOCS / 2 elements

velems[0] = alloc4
velems[1] = alloc5
velems[2] = alloc6
velems[3] = alloc7
velems[4] = alloc8
velems[5] = free
velems[6] = free
velems[7] = free

So we have filled velems[0] .. velems[4]. What is the problem?

You always know the position of the last element (num_elems_in_use - 1) so you replace the released element with it and clear the last one.
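The swap-with-last scheme described above can be sketched outside SOF like this (simplified element type; names mirror the PR's but the code is illustrative):

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ALLOCS 8 /* stand-in for CONFIG_SOF_VPAGE_MAX_ALLOCS */

struct velem { uint32_t vpage; uint32_t pages; };

static struct velem velems[MAX_ALLOCS];
static size_t num_elems_in_use;

/* O(1) insert: the first free slot is always at index num_elems_in_use */
static int velem_add(uint32_t vpage, uint32_t pages)
{
	if (num_elems_in_use == MAX_ALLOCS)
		return -1;
	velems[num_elems_in_use].vpage = vpage;
	velems[num_elems_in_use].pages = pages;
	num_elems_in_use++;
	return 0;
}

/* O(1) removal: overwrite the released slot with the last in-use
 * element, then clear the (now duplicate) last slot */
static void velem_remove(size_t idx)
{
	num_elems_in_use--;
	velems[idx] = velems[num_elems_in_use];
	velems[num_elems_in_use].vpage = 0;
	velems[num_elems_in_use].pages = 0;
}
```

Lookup by address stays linear, but add and remove become O(1) and any scan never has to visit slots beyond num_elems_in_use.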

Collaborator


Alternatively, we could completely remove this table and rely on two bitarrays. One bitarray would mark that a given page is the first allocated page. The second one would encode allocation size - either by setting a bit on the last page of the allocated block or, inversely, by marking that the next page still belongs to the same allocation. This approach is currently used in vmh. Allocation size calculation there requires heavy optimization, but it is a good starting point to present the idea. With proper optimization, we could check continuity of 32 pages at once, and an ffs operation could return the index of the last page.

sof/zephyr/lib/regions_mm.c

Lines 459 to 496 in 1f18231

/* Mechanism of saving allocation size is a bit complicated and genius
* thought up by Adrian Warecki.
* This explanation is long and will be moved to documentation afterwards.
* Initial setup for mechanism:
* First bit array that saves allocated blocks
* in our system this will be bit array saved in mem_blocks API.
* Second bit array which is of exactly same size of first one
* that will save connection of allocation to size of allocation
* by marking if the bit of allocation is connected to next bit.
*
* Examples: lets have simple 16 bit representation of blocks
* First: 0000000000000000 allocate first four blocks 1111000000000000
* Second:0000000000000000 1110000000000000
* The allocator saves whether block was allocated and second bit array
* saves information by marking in position of a bit allocated 1 if
* it is a start of the allocation and has more than one
* block allocated, 1 if following bit is also part of allocation.
* 0 for individually allocated block and 0 for block ending allocation
* Continuing the example above
* First: 1111100000000000 allocate another block 1111100000000000
* Second:1110000000000000 after the first one 1110000000000000
* based on information above we know exactly how long allocations were
* by checking second bit array against first one
* First: 1111111100000000 allocate another block 1111111100000000
* Second:1110000000000000 and another two blocks 1110001000000000
* We are still able to pinpoint allocations size
*/
/* Set bits count - 1 on the same offset as in allocation bit array */
allocation_bitarray_offset = (uintptr_t)ptr
- (uintptr_t)heap->physical_blocks_allocators
[mem_block_iterator]->buffer;
allocation_bitarray_position = allocation_bitarray_offset / block_size;
/* Need to set bits only if we have more than 1 block to allocate */
if (block_count - 1)
sys_bitarray_set_region(heap->allocation_sizes[mem_block_iterator],
block_count - 1, allocation_bitarray_position);
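A toy 16-bit model of the scheme quoted above (illustrative only, not SOF code): the first array marks allocated blocks, the second sets count - 1 bits starting at the allocation's first block, so recovering an allocation's size is just counting the run of set bits:

```c
#include <stdint.h>

/* Toy model (bit 0 = first block) of the two-bitarray scheme:
 * "alloc" marks blocks in use; "sizes" sets (count - 1) bits starting
 * at the allocation's first block, chaining each block to the next. */
struct toy_heap { uint16_t alloc, sizes; };

static void toy_alloc(struct toy_heap *h, unsigned int pos, unsigned int count)
{
	for (unsigned int i = 0; i < count; i++)
		h->alloc |= 1u << (pos + i);
	for (unsigned int i = 0; i + 1 < count; i++)
		h->sizes |= 1u << (pos + i);
}

/* Recover an allocation's block count from its first-block position */
static unsigned int toy_alloc_size(const struct toy_heap *h, unsigned int pos)
{
	unsigned int count = 1;

	while (h->sizes & (1u << (pos + count - 1)))
		count++;
	return count;
}
```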

Collaborator Author


So we have filled velems[0] .. velems[4]. What is the problem?

@softwarecki I understand your proposal, I didn't say, that there was a problem, I just suggested to make that a later optimisation.

Collaborator


Now SOF_VPAGE_MAX_ALLOCS / 2 - 1 elements are free, but you'll fail any attempt to allocate memory again - as long as the last element is kept.

I didn't say, that there was a problem

👀 Okeyyyy...

Using a list doesn't bring a performance benefit compared to the current array-based approach but could consume more memory due to additional link pointers in each list element.

Optimizations for adding and removing elements from the array were already proposed in a previous review. They are straightforward to implement and can be applied immediately. I even provided example code in the comments. An optimization based on a bitarray does require more work and could reasonably be planned as a next step.

Comment on lines +137 to +144
if (i == VPAGE_MAX_ALLOCS) {
/*
* The caller is holding a lock, so someone must have changed
* vpages state without taking it
*/
LOG_ERR("data corruption, check for races");
return -EPROTO;
}
Collaborator


It looks like AI slop. We already check the number of allocation elements, so failing to find a free entry in this array is not expected. Allocations are protected by a mutex, so there is no "someone".

/* check for valid ptr which must be page aligned */
if (!IS_ALIGNED(ptr, CONFIG_MM_DRV_PAGE_SIZE)) {
LOG_ERR("error: invalid non aligned page pointer %p", ptr);
return -EINVAL;
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k_panic()? This seems like a serious problem and we should not allow it to leak memory.

Collaborator Author


not here certainly. Anybody can call vpage_free() with a wrong pointer

Collaborator


Passing a wrong pointer sounds like a serious problem...

Collaborator Author


Passing a wrong pointer sounds like a serious problem...

in the calling code, yes. But not here. I think a panic is more for internal inconsistencies. Passing an invalid argument to a function doesn't sound like one to me. But I suppose we can find different cases when it's used in SOF. So, it isn't something I have a strong opinion about, but unless there are good arguments, I'd keep it this way.

* Allocate aligned memory from the specified virtual region based on the memory type.
*
* @param[in] vr Pointer to the virtual region instance.
* @param[in] type Type of memory to allocate (static, dynamic, or shared static).
Collaborator


ditto

};

/* zephyr k_heap for interim allocations. TODO: make lockless for improved performance */
struct zephyr_heap {
Collaborator


zephyr_heap? A vague and misleading name. It suggests a Zephyr origin.

* information about the base address, size, and allocation status
* of the region.
*
* Currently the virtual region memory region can be partitioned into two areas
Collaborator


one "region" too many?

if (align == CONFIG_DCACHE_LINE_SIZE)
size = ALIGN_UP(size, CONFIG_DCACHE_LINE_SIZE);

/* calculate new heap object size for object and alignments */
Collaborator


maybe the comment could be clarified?

return;
}

LOG_ERR("error: vregion free invalid pointer %p", ptr);
Collaborator


k_panic()?

Collaborator Author


that would be a DOS invitation

Collaborator

@softwarecki softwarecki Mar 26, 2026


To execute a DoS, all a userspace thread needs is to reference an invalid address or divide by 0, and the firmware crashes. There's no exception recovery in Zephyr. Such a panic would allow a bug to be detected immediately instead of slowly losing available memory.

Collaborator


Actually @softwarecki that's not correct. Zephyr can recover, but we have just not implemented this yet in SOF. See https://docs.zephyrproject.org/latest/kernel/services/other/fatal.html#fatal-error-handling
I do agree the primary goal is to protect memory and data. Ability to recover from misbehaving user-space code is an additional step, but Zephyr does provide support to implement this as well.

Collaborator


Ok, so let's use an assert to detect it in debug builds.

Collaborator Author


Ok, so let's use an assert to detect it in debug builds.

ok, I checked sys_heap_free() in Zephyr and it does use __ASSERT() when checking (some) invalid addresses, so we could do something similar. But what should it do if assertions are disabled? Just proceed and crash somewhere randomly later? Or use both an assertion and a return? I think we can use CHECKIF() and enable CONFIG_ASSERT_ON_ERRORS in our debug overlay
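A sketch of the CHECKIF() approach mentioned here (the fallback macro, constant and function below are stand-ins for a non-Zephyr build; in Zephyr, CHECKIF() comes from <zephyr/sys/check.h> and asserts when CONFIG_ASSERT_ON_ERRORS is enabled, otherwise degrading to a plain if()):

```c
#include <errno.h>
#include <stdint.h>

/* Outside Zephyr, emulate CHECKIF() as a plain if(); with
 * CONFIG_ASSERT_ON_ERRORS the real macro would assert instead */
#ifndef CHECKIF
#define CHECKIF(expr) if (expr)
#endif

#define PAGE_SIZE 4096 /* stand-in for CONFIG_MM_DRV_PAGE_SIZE */

/* Hypothetical free-path guard: asserts in debug builds, returns an
 * error (instead of crashing randomly later) in release builds */
static int vpage_free_checked(void *ptr)
{
	CHECKIF(((uintptr_t)ptr & (PAGE_SIZE - 1)) != 0) {
		return -EINVAL;
	}
	/* ... unmap and release the pages here ... */
	return 0;
}
```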

Collaborator

@softwarecki softwarecki Mar 27, 2026


But what should it do if assertions are disabled? Just proceed and crash somewhere randomly later?

At the moment in the code you proposed there is only LOG_ERR, so that's exactly what's happening.

Or use both an assertion and a return?

Yes, we can use __ASSERT() and LOG_ERR(). What return?

lrgirdwo and others added 3 commits March 27, 2026 10:50
Add a simple virtual page allocator that uses the Zephyr memory mapping
infrastructure to allocate pages from a virtual region and map them
to physical pages.

Due to simplicity, the number of allocations is limited to
CONFIG_SOF_VPAGE_MAX_ALLOCS

Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Add support for per pipeline and per module virtual memory regions.

The intention is to provide a single virtual memory region per pipeline or
per DP module that can simplify module/pipeline memory management.

The region takes advantage of the way pipeline and modules are constructed,
destroyed and used during their lifetimes.

1) memory tracking - 1 pointer/size per pipeline or DP module.
2) memory read/write/execute permissions
3) cache utilization.

Modules and pipelines will allocate from their region only and this
will be abstracted via the allocation APIs.

Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Add boot-time tests for basic vpage and vregion functionality.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>